A Lexicographer-Friendly Association Score
Abstract
Finding collocation candidates is one of the most important and widely used features of corpus linguistics tools. Many statistical association measures are used to identify good collocations. Most of these measures define a formula for an association score, which indicates the amount of statistical association between two words. The score is computed for all possible word pairs, and the word pairs with the highest scores are presented as collocation candidates. The same scores are used in many other algorithms in corpus linguistics. The score values are usually meaningless and corpus-specific, so they cannot be used to compare words (or word pairs) from different corpora. End users, however, want such scores to be interpretable and stable. This paper presents a modification of a well-known association score that has a reasonable interpretation and other good features.
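The general procedure the abstract describes (score every word pair, then present the highest-scoring pairs) can be sketched as follows. This is a minimal illustration, not the paper's own score, which the abstract does not specify: it assumes the plain Dice coefficient as the association measure, counts co-occurrence at the sentence level, and uses a made-up toy corpus. The names `dice_scores`, `top_candidates`, and `toy_corpus` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def dice_scores(sentences):
    """Return {(x, y): Dice score} for word pairs co-occurring in a sentence.

    Assumption: frequencies are counted as the number of sentences that
    contain a word (or both words of a pair), not as raw token counts.
    """
    word_freq = Counter()
    pair_freq = Counter()
    for sentence in sentences:
        words = set(sentence.lower().split())
        word_freq.update(words)
        # Sort so each unordered pair has one canonical key.
        pair_freq.update(combinations(sorted(words), 2))
    return {
        (x, y): 2 * f_xy / (word_freq[x] + word_freq[y])
        for (x, y), f_xy in pair_freq.items()
    }


def top_candidates(sentences, n=3):
    """Return the n highest-scoring pairs as collocation candidates."""
    scores = dice_scores(sentences)
    # Sort by descending score, breaking ties alphabetically by pair.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:n]


# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    "strong tea please",
    "strong tea again",
    "strong coffee please",
    "weak tea",
]

if __name__ == "__main__":
    for pair, score in top_candidates(toy_corpus):
        print(pair, round(score, 3))
```

The Dice coefficient is bounded between 0 and 1 and depends only on the frequencies of the two words and of the pair, not on the total corpus size. That makes it one simple example of a score that is easier to compare across corpora than measures that also depend on corpus size.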
Similar resources
A picture is worth a thousand words: Using OpenClipArt library to enrich IndoWordNet
WordNet has proved to be immensely useful for Word Sense Disambiguation, and thence Machine translation, Information Retrieval and Question Answering. It can also be used as a dictionary for educational purposes. The semantic nature of concepts in a WordNet motivates one to try to express this meaning in a more visual way. In this paper, we describe our work of enriching IndoWordNet with image ...
Spatial justice in the age-friendly city index of Tehran
Background: The World Health Organization has proposed age-friendly cities as an urban development approach. Likewise, the spatial distribution of urban facilities can be considered an important issue among urban planners. Method: In 2019, a sample of 770 elderly people was selected by the multi-stage sampling method. Data collection was accomplished using a standard questionnaire of the World Heal...
An Evaluation of a Lexicographer's Workbench Incorporating Word Sense Disambiguation
NLP system developers and corpus lexicographers would both benefit from a tool for finding and organizing the distinctive patterns of use of words in texts. Such a tool would be an asset for both language research and lexicon development, particularly for lexicons for Machine Translation. We have developed the waspbench, a tool that presents a word sketch, a summary of the corpus evidence for a word, to...
DILEMMA - An Instant Lexicographer
Dilemma is intended to enhance quality and increase productivity of expert human translators by presenting to the writer relevant lexical information mechanically extracted from comparable existing translations, thus replacing or compensating for the absence of a lexicographer and stand-by terminologist rather than the translator. Using statistics and crude surface analysis and a minimum of pri...